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Abstract 

Magnetoencephalography (MEG) provides dynamic spatial-temporal insight of neural activ- 
ities in the cortex. Because the number of possible sources is far greater than the number of 
MEG detectors, the proposition to localize sources directly from MEG data is notoriously ill- 
posed. Here we develop an approach based on data processing procedures including clustering, 
forward and backward filtering, and the method of maximum entropy. We show that taking as 
a starting point the assumption that the sources lie in the general area of the auditory cortex 
(an area of about 40 mm by 15 mm), our approach is capable of achieving reasonable success 
in pinpointing active sources concentrated in an area of a few mm's across, while limiting the 
spatial distribution and number of false positives. 
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1 Introduction 

Magnetoencephalography (MEG) records magnetic fields generated from neurons when the brain is 
performing a specific function. Neural activities thus can be noninvasively studied through analyzing 
the MEG data. Since the number of neurons (unknowns) are far larger than the number of MEG 
sensors (knowns) outside the brain, the problem of identifying activated neurons from the magnetic 
data is ill posed. The problem becomes even more severe when noise is present. 

A first and essential step in surpassing the obstacle of ill-posedness is to rely on prior knowledge 
of the general area of active current sources producing the MEG data. Often, this prior knowledge 
is provided by functional magnetic resonance imaging (fMRI) experiments. The two kinds of exper- 
iments complement each other, MEG has high temporal resolution (about 10~^s) but poor spatial 
resolution, whereas fMRI has high spatial resolution fT\ but poor temporal resolution (about 1 s). 
Consider auditory neural activity. FMRI shows that such activity is concentrate in the auditory 
cortex, two 40 mm by 15 mm areas respectively located on the two sides of the cortex (Fig.[TJj). Be- 
cause of poor temporal resolution, fMRI cannot resolve the rapid successive firing of the (groups of) 
neurons, instead it shows large areas of the auditory cortex lighting up. Under MEG, the same au- 
ditory activity will be detected by high time-resolution sensors (Fig. [TJl) which however collectively 
cover an area (when projected down to the cortex) much greater than the auditory cortex. 
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Figure 1: Left, side view of the human cortex; the frontal lobe points to the left. The region of 
interest, the auditory cortex, is marked by the dark rectangular. Right, schematic setup of an MEG 
experiment. The occipital lobe (right side in graph) is closer to the sensors because the person being 
tested is lying face- up. 

Although many methods have been proposed to attack the ill-posed problem, including dipole 
fitting, minimum- norm- least-square (MNLS) [2j, and the maximum entropy method (ME) [3], im- 
provements have been limited. One reason for this may be that these methods do not properly 
include anatomic constrains. In this work, we propose a novel approach to analyze noisy MEG data 
based on ME that pays special attention to obtaining better priors as input to the ME procedure 
by employing clustering and forward and backward filtering processes that take anatomic constrains 
into consideration. We demonstrate the feasibility of our approach by testing it on several simple 
cases. 

2 The ill-posed inverse problem 

Given the set of current sources (dipole strengths) {ri\i = 1, 2, • • • , Nr} at sites {zi\i = 1, 2, • • • , A^^}, 
the magnetic field strength m(x) measured by the sensor at spatial position x is given by, 

m(x) = A{x.)iri + n(x) = i(x) • f + n(x) (1) 

i=l 

where the function A(x)i, which is a function of z^, is derived from the Biot-Savart law in vacuum, 
n(x) is a noise term, and a hatted symbol denotes an A/'^-component vector in source space. We 
consider the case where there are Nm sensors at locations Xq,, a=l,2,- • • , Nm- To simplify notation, 
we write m as an A^^-component vector whose a^^ component is m"=m(xQ,), similarly for A and 
n. Then Eq. ([T]) simplifies to 

m = A • f + n (2) 

In practice, magnetic field strengths are measured at the Nm sensors and the inverse problem is to 
obtain the set of Nr dipole strengths with Nr >> Nm- The presence of noise raises the level of 
difficulty of the inverse problem. 

3 Methods 

Human Cortex and MEG Sensors. In typical quantitative brain studies, the approximately 
10^^ neurons in the cortex are simulated by about 2.4x10^ current dipoles whose directions are set 
parallel to the normal of the cortex surface [4^ 5j. For this study we focus on the auditory cortex, 
the area marked by the 40 mmx 15 mm rectangular shown in Fig. [Hj that contains 2188 current 
dipoles. 

In typical MEG experiments the human head is surrounded by a hemispheric tiling of magnetic 
field sensors called superconducting quantum interference devices (SQUIDS). In the experiment we 
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consider, there are 157 sensors, each composed of a pair of co-axial coils of 15.5 mm diameter 50 
mm apart, with the lower coil about 50 mm above the scalp [6j. The centers of the sensors are 
represented by the gray dots in Fig. [TJl. Details of geometry and of the A-matrixes in Eq. ([1]) are 
given in [7J. 

Artificial MEG data and Noise. We use artificial MEG data generated by the forward equa- 
tion, Eq. ([U), from sets of current dipoles (to be specified below) in a small area (black circles in 
Figs.[2fl) within the auditory cortex (the gray "rectangular" region in Figs.[2)L, enlargement shown 
in Figs. [2fl). A site-independent white Gaussian noise is linearly superimposed on the MEG data. 




Figure 2: Left, the circular and triangular symbols are the positions of the 31 sensors with detectable 
signals, including the 7 (triangular) with signals above threshold. Sensors with magnetic fiux going 
into (out) the page are solid (hollow). Right, detail of auditory cortex. The dark (orange in color) 
circles in the top-right corner indicate the general area of active sources used to generate artificial 
MEG data. 

The signal to noise ratio (SNR) is defined as 

SNR = -lOlogioWnmax \\y\\nimax\f (3) 

where nimax is the amplitude at the sensor receiving the strongest noiseless MEG signal, and Umax 
is amplitude of the strongest simulated noise. In this study we have m^aa?=7.4 fT (fT= femto-Tesla) 
and n^aa;=0.05m^aa;=0.37 fT, so that SNR=26 on each individual run. The artificial MEG data 
is generated by giving a current of 10 nA (nano-ampere) to each of the sources in a source set (see 
below), running the forward equation with noise 10 times and taking the averaged strengths at the 
sensors. The averaging has the effect of reducing the effective Umax by a factor of a/TO, and yielding 
an enhanced effective signal to noise ratio of SNR^=36. 

Using a threshold of Ts=l^nmax=^-^ fT we select a subset Ms of 7 "strong signal" sensors. 
This implies a minimum value of SNR^=32.9 (with averaging) on each sensor in the set. Given the 
(assumed) normal distribution of noise intensity, this selection implies that at 99.99% confidence 
level the signals considered are not noise. Similarly we use a threshold of Tc=6n^aa?=2.2 fT to 
select a subset Mc of 31 "clear signal" sensors, with minimum values of SNR' =25.6 on each sensor 
in the set. In actual computations below, we reduce the sensor space to one that include only those 
in the set Mc- In practice, the reduction replaces m. A, and n by m'. A', and n', respectively. 

Receiver Operating Characteristics Analysis. We evaluate the goodness of our results using 
receiver operating characteristics (ROC) analysis [9j, in which the result is presented in the form 
of a plot of the true positive rate (5*^, sensitivity) versus the false positive rate (or 1-5^, where Sp 
is the specificity). Let R be the total solution space of current sources, T the true solutions, or 
actual active sources, and P the positives, or the predicted active sources. Then F=R-T is the false 
solutions, TP={Pf]T) the true positives, FP={Pf]T)-T the false positives, and FN=R-{Pf]T) 
the false negatives. By definition Sn=TP/T={Pf]T)/T and l-Sp=FP/F={{P\jT) -T)/{R-T). 
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cluster size 

Figure 3: Distribution of cluster size in the clustering of 2188 sources into 250 clusters. 

Intuitively, a good solution is one such that maximizes Sn while minimizing 1-Sp. In a null theory, 
the positives will fall randomly into hence TP/T=l-Sp^ or Sn=^-Sp. Therefore, the merit of 
model producing a piece of data, showing on an ROC plot is measured by the difference 

between Sn and 1-Sp. In general, when a model is used to generate a curve in an ROC plot, the 
"area under the curve" (AUC), or the area between the curve and the Sn=l-Sp line, is a measure 
of merit of the model [lOj. 

Clustering and Sorting. Although implicit in the MRI head model introduced in Fig. [T] is a 

dramatic reduction of the number of neurons and the complexity of the cortex, the remaining number 
of effective neurons is still far greater than the number of detectors. We use a clustering algorithm [8] 
to further decrease the number of effective sources, in which the sources are partitioned according to 
spatial proximity and similarity in orientation into the set of Nc clusters C = {Cu\u = 1^2^ ^ Nc}, 
as follows. We require sources within a cluster to lie with a spatial radius of 5 mm and define 
{Nu = ^i^c number of current sources in cluster Cu) 

^« = E (4) 

ieCu 

as the "strength" of the A-matrices in cluster C^, 

du=Yl \NuA'^-A'J/Nu (5) 

as the "radius" - in the space of sensors - of cluster C^, and 

Z)„= E \X-A'J/{Nc-l) (6) 

c^ec 

as the average inter-cluster distance between Cu and all the other clusters. The clustering, including 
Nc^ is determined by requiring that 

du/Du<-fc. Vii = 1,2,--- ,Ar^, (7) 

where 7c is a parameter that controls the average cluster size; a smaller value of 7c implies smaller 
and more numerous clusters. In the limit 7c ^0 every cluster will consist of a single source and 
Nc^Nr^ or 2188 in the present case. A clustering obtained with 7c=l/7 was used in this work. It 
partitions the 2188 sources into A/'c=250 clusters, whose size distribution is shown in Fig. [3l 

The clustering results in the replacement of original source distribution by a coarse-grained 
distribution of virtual source-clusters whose Nr A'^-matrices are given by Nc A'Js. The clustering 
reduces Eq. ([2]) to 

m' = A: -rc^ n' (8) 



4 



which has the same form as Eq. ([2]) except that here the hatted vectors have only Nc components 
and each of Nc components in rc denotes the strength of the current dipole representing a cluster. 

It is convenient to sort the cluster set C according to the field strength of the clusters. Since the 
field strength depends on the where it is measured, the sorted order will be sensor-dependent. We 
denote the sorted set for sensor a by Thus we have: 

C^"^ = {CuM^ = 1,2,- ■■ ,Nc}, |^:,J > IXJ if w« < VaGMc. (9) 

Forward Filtering. A key in improving the quality of the solution of an inverse problem is to 
reduce the number of false positives. In the MEG experiments under consideration, the plane of 
the sensors are generally parallel to the enveloping surface of the cerebral cortex. Such sensors are 
meant to detect signals emitted from current sources in sulci on the cortex, and are not sensitive 
to signals from sources in gyri. In practice, in our test cases T will be composed of sulcus sources. 
Therefore, if we simply remove those clusters having the weakest strengths, we will reduce FP at a 
higher rate than TP. 

Given a positive fractional number ^ < 1, we use it to set an integer number < Nc^ and use 
A^^ to define the truncated sets 

^J"^ = {aj^« = l,2,---,Arj, yaeMs. (10) 
The integer is determined by regression by demanding the union set 

Ri= [J rI''^=^R (11) 

aeMs 

to be a fraction ^ of R. We call this forward filtering process of reducing the pool of possible 
positives from R to R^ the mostly sulcus model (MSM). About 25% of current sources in R lie in 
gyri. Therefore, if we set ^=0.75, very few potentially true sources will be left out. It turns out that 
in the region where 1-Sp is only slightly less than unity, setting P to R^ can offer the best result. 

Backward Filtering. Another way of reducing the pool of positives is to limit them to those 
clusters, with unit current strength, whose \A'\ value is greater than a threshold value Aq at all the 
sensors in Ms. This yields, for each a, a reduced set with Na < Nc clusters: 

^>"^ = {Cu^ = 1, 2, • • • , N^}^, \/ aeMs. (12) 

Now we let Rshm-, the pool of positives for the "simple head model" (SHM), be the intersection of 
the reduced sets, 

Rshm = f] 4"^- (13) 

Rshm represents a coarse-grained, simplified cortex tailored to the MEG data at hand: every source- 
cluster in Rshm has a relatively high probability of contributing significantly to all the sensors in 
Ms. A hypothetical case in which Ms is composed of the two sensors, ai and a2, is depicted in 
Fig.gl 

For this paper Aq is set to be 4.5 fT. Then the A^^^'s have values 113, 134, 102, 126, 103, 119, 
and 64, respectively, for the 7 sensors in M^, and the simple head model set Rshm contains 240 
source currents, or about 11% of the total number of current dipoles in the auditory region. 

The Maximum Entropy Method. The maximum entropy (ME) method is a method for deriving 
the "best" solution in ill-posed problems 0[IIl[Il[I3[H[l3[Ii[I71[T8l[19^ Generally, the equation 
that admits multiple solutions is treated as a constraint and, given a prior probability distribution of 
solutions, the method finds a posterior probability distribution by maximizing the relative entropy 
of the probability distributions. When applied to the MEG case, Eq. (|8]) (or Eq. ([2]) without 
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Figure 4: Illustration of backward filtering, when Mq is composed of the two sensors, a^ and a^?.. 
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Figure 5: Data processing flowchart for producing a prior input for the maximum entropy method. 



clustering) is used to constrain the posterior propbability distribution for a f that is the "best" ME 
solution, given m (measured) and n (presumed or otherwise obtained). The procedure is tedious 
but standard and an implementation was reported in [20j. Here we only describe how the prior 
probability distribution of solutions are determined in this work. 

We test several procedures ranked by their levels of complexity: (1) Simple ME (ME). All the 
2188 individual dipoles are included in the prior P. Here as in all other cases, in ME iteration 
involving sensors, only those in the "clear signal set" Mc are included. (2) ME with clustering 
but not filtering (C-ME). Cluster are treated as units of sources and all clusters are included in the 
prior P. (3) ME with filtering but not clustering (F-ME). Individual dipoles are treated as units of 
sources but only those in or Rshm (whichever is the smaller set) are included in the prior P. (4) 
ME with clustering and filtering (C-F-ME). Cluster are treated as units of sources but only those 
in or Rshm are included in the prior P. Fig. [5] is the flowchart for computation for the above 
procedures. In each case the set of positives, P, hence Sp^ is varied during the implementation of 
ME by a threshold on the strength of source dipoles for acceptance into the set. The exception is 
when the "F" procedure is taken. In this case, when 0<6'p<0.25, the set of positives, P, is directly 
set equal to R^ as described in Eq. (pT|) . without going through the ME procedure. When 6'p>0.25 
the procedure is switched to SHM followed by ME. In addition, we compare ME and MNLS [2j. 
When MNLS is involved the flowchart is the same as given in Fig. [5l with ME replaced by MNLS. 
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Figure 6: Left, true cluster set of sources (solid circles) in test 1. Right, ROC plot for test 1. 



4 Results and discussions 

We report three preliminary tests exploring the properties of the procedures. 

Test 1. The true set T is one cluster containing 13 sources covering an area of 4 mm by 1 mm 
(Fig. [6)L). The computed results, shown as an ROC plot, are given in Fig. [Bp.- Not shown are 
results l-5'p>0.20, when all four procedures give 5'n=l. It is seen that C-F-ME gives the best result: 
Sn=^ for the entire range of values for 1-Sp. C-ME and F-ME are excellent when 1-S'p>0.025, but 
completely fail to identify the true sources when l-6'p<0.025. In comparison, ME performs not as 
well as C-ME and F-ME when 1-Sp>0.025 but is better when smaller values of 1-Sp. Note that the 
R set contains 2188 sources and the T set 13 sources. So even at 1-5*^=0. 025 there are still 54 false 
positive {FP) sources. 

Test 2. The true set T is composed of two clusters containing 12 and 7 sources, respectively, covering 
an area 5.5 mm by 1 mm (Fig.[3L). The ROC plot (Fig.[3l) shows better results are obtained when 
F is involved (F-ME and C-F-ME). When clustering is involved (C-ME and C-F-ME) changes in 
Sn are discrete. This is because P may take only take four values, 1 when P contains both true 
clusters, 0.63 or 0.37 when it contains of the two, and when it contains none. Recall that clustering 
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Figure 7: Left, the two true cluster sets of sources (solid circles and hollow squares) in test 2. Right, 
ROC plot for test 2. 
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Figure 8: Left, the 42 source dipoles in test 3. Right, ROC plot for test 3; only results with l-6'p<0.2 
are shown. 

simplifies the organization of the sources {R) but does not reduce the prior positives (P). F reduces 
the prior P but does not simplify R. C-F does both. That procedures with F are clearly better than 
those without highlights the paramount importance of a better prior in all but trivial situations 
when ME is employed. 

Test 3. The true set T is composed of 42 active sources covering an area approximately 3 mm by 2 
mm (Fig. [8)L). They are distributed in eight clusters (C^, u=l to 8) containing 18, 31, 15, 12, 9, 23, 
18, and 17 sources, respectively, for a total of 143 sources. The intersection of these clusters with T 
are 4, 11, 4, 3, 4, 4, 6, and 6 sources, respectively. If clustering is applied, the minimum value for 
1-Sp is 0.047 when 6'n=l. The ROC shown in Fig.[8tl shows that for this relatively complicated case 
simultaneous high Sn and Sp is difficult to achieve; we obtain Sn<0.6 for l-6'p<0.20 in all procedures. 
Here again, procedures with F, which generate better priors^ yield more accurate positives than those 
without. It is worth pointing out the accuracies of the positives given by F-ME and C-F-ME are 
very similar in almost the entire range of 1-6'p, but F-ME is decisively better than C-F-ME when 
1-Sp is less than 0.01. The last effect brings out an inherent weakness of clustering. If the active 
sources (the T set) are spread out in more than one cluster and and if the union of the clusters 
is greater than the T set, then a solution with a non-null P {Sny^O) and null FP (^l-Sp=0) is not 
possible. In contrast, such an outcome is at least possible without clustering. As it happened, for 
the case at hand, at 1-Sp^O^ the P set for C-F-ME is cluster C4 with 12 sources, of which 3 belongs 
to T, yielding 6'n=3/42=0.071 and l-6'p=9/2146=0.0042. 

Fig. [9] shows the 21 strongest sources (the "strong set"), not necessarily all in the positive set, 
given by the four procedures when l-6'p<0.02 (the sets do not change much in this range of 1-Sp). 
In the case of C-F-ME (panel (a)), the strong set comes from clusters C3 and (^4, both of which lie 
in the vicinity of T. When 1-5'^ is lowered (by raising the P-acceptance threshold) beyond a certain 
point Cs is eliminated, leaving only C4 in P in the situation discussed above. In the case of F-ME 
(panel (b)), some individual sources in the strong set are not in the vicinity of T. However, these 
are eliminated when 1-Sp is lowered, and the remaining individual sources happen to have a FP 
that is smaller than that in C-F-ME. 

ME versus MNLS. We compare the effectiveness of the MNLS procedure against ME using the 
true set of test 3. Fig. [10)L shows the ROC plots for C-F-ME (same plot as in Fig. [8]), C-F-MNLS 
and MNLS. We observe that MNLS is worse than ME (Fig. [8]), C-F-MNLS is better than MNLS, 
and C-F-ME is better than C-F-MNLS. This shows, at least for the case tested, clustering is also 
beneficial to MNLS and, other things being equal, ME is more effective than MNLS. In the last 
instance the PP's of MNLS cover a large area of the auditory cortex. The strong sets in the three 
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Figure 9: The strong sets (21 strongest sources) in test 3 at l-5'p<0.02. In each case the gray 
(orange in color) blotch near the top-right corner is the T set of 42 active sources, (a) C-F-ME 
(solid squares), (b) F-ME (bullets), (c) C-ME (triangles), (d) ME (open circles). 
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Figure 10: Left, ROC plots for MNLS, C-F-MNLS, and C-F-ME. Set T is the 42 active sources of 
test 3. Right, the strong set (dark symbols) at 1-S'j,<0.02. (a) C-F-ME (solid square); (b) C-F- 
MNLS (open circle); (c) CMNLS (triangle). T is the set of grey (orange) circles near the top-right 
corner in panels (a-c). 
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procedures are given in Fig. IIQR. If we take these dipoles as the positive set then we obtain the 
results given in Table [H 
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0.000 
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l-Sp 


ME 


0.0096 


0.0087 


0.0073 


0.0064 




MNLS 


0.0096 


0.0096 


0.0087 


0.0096 



Table 1: Comparisons of ME and MNLS using the set of 42 active sources of test 3. In each procedure 
the positive set is composed of the 21 current dipoles with the strongest currents. The third column 
gives results with neither clustering nor F. 

In summary, we have employed several simple cases to illustrate that even when we may use 
fMRI data to tell us where the general region of the source currents are, the nature of the inverse 
problem is such that the challenge to precisely pinpoint sources located in a small area is still great. 
We showed that clustering tends to limit the area covered by false positives, and filtering is effective 
for generating better priors for ME. When we use ME implemented by these procedures, we can 
achieve part partial success in pinpointing sources concentrated in an area the size of a few mm 
across, comparable to the spatial resolution of fMRI [Ij, while limiting the spatial distribution and 
number false positives. Considering that the area of the active sources is miniscule compared to 
the auditory cortex, which is itself yet much smaller than cortical surface covered by the sensors 
receiving clear or even strong signals, the achieved level of success reported here, even as it still 
leaves much to be desired, is a vast improvement over what can be inferred simply from the pattern 
of sensors with strong signals. 

This work is supported in part by grant nos. 95-231 l-B-008-001 and 95-291 1-L008-004 from the 
National Science Council (ROC). 
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